A Computationally Efficient Projection-Based Approach for Spatial Generalized Linear Mixed Models
Inference for spatial generalized linear mixed models (SGLMMs) for
high-dimensional non-Gaussian spatial data is computationally intensive. The
computational challenge is due to the high-dimensional random effects and
because Markov chain Monte Carlo (MCMC) algorithms for these models tend to be
slow mixing. Moreover, spatial confounding inflates the variance of fixed
effect (regression coefficient) estimates. Our approach addresses both the
computational and confounding issues by replacing the high-dimensional spatial
random effects with a reduced-dimensional representation based on random
projections. Standard MCMC algorithms mix well and the reduced-dimensional
setting speeds up computations per iteration. We show, via simulated examples,
that Bayesian inference for this reduced-dimensional approach works well in
terms of both inference and prediction; our methods also compare favorably
to existing "reduced-rank" approaches. We also apply our methods to two
real-world data examples, one on bird count data and the other on classifying
rock types.
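As a rough illustration of the idea of replacing n spatial random effects with a low-dimensional representation built from a random projection, the sketch below forms an orthonormal basis for the range of K·Ω, where K is a covariance matrix and Ω is a Gaussian random matrix. This is a generic randomized range finder, not the authors' algorithm; the exponential covariance, the range parameter `phi`, and both function names are assumptions made only for this example.

```python
import math
import random

def exp_covariance(coords, phi=0.3):
    """Exponential covariance matrix (an illustrative choice of kernel)."""
    return [[math.exp(-math.dist(s, t) / phi) for t in coords] for s in coords]

def random_projection_basis(K, m, seed=0):
    """Return m orthonormal vectors spanning the range of K @ Omega,
    where Omega is an n x m matrix of independent standard normals."""
    rng = random.Random(seed)
    n = len(K)
    # Build the columns of K @ Omega one at a time.
    cols = []
    for _ in range(m):
        omega = [rng.gauss(0.0, 1.0) for _ in range(n)]
        cols.append([sum(K[i][j] * omega[j] for j in range(n)) for i in range(n)])
    # Modified Gram-Schmidt to orthonormalise the projected columns.
    basis = []
    for v in cols:
        for q in basis:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis

# Hypothetical usage: 25 locations on a grid, reduced to 4 basis vectors.
coords = [(i / 4, j / 4) for i in range(5) for j in range(5)]
basis = random_projection_basis(exp_covariance(coords), m=4)
```

With such a basis Φ (n × m), the n spatial random effects could be approximated as Φδ with only m coefficients δ, which is what makes each MCMC iteration cheaper in a reduced-dimensional setting.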
Unsupervised Semantic Representation Learning of Scientific Literature Based on Graph Attention Mechanism and Maximum Mutual Information
Because most scientific literature data are unlabeled, unsupervised
graph-based semantic representation learning is crucial. Therefore, an
unsupervised semantic representation learning method of scientific literature
based on graph attention mechanism and maximum mutual information (GAMMI) is
proposed. A graph attention mechanism is introduced so that, in the weighted
summation of nearby node features, the weights of adjacent node features depend
entirely on the node features. Depending on the features of the nearby nodes, different
weights can be applied to each node in the graph. Therefore, the correlations
between vertex features can be better integrated into the model. In addition,
an unsupervised graph contrastive learning strategy is proposed to address the
lack of labels and to scale to large graphs. By comparing the
mutual information between the positive and negative local node representations
on the latent space and the global graph representation, the graph neural
network can capture both local and global information. Experimental results
demonstrate competitive performance on various node classification benchmarks,
sometimes even surpassing the performance of supervised learning.
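The attention-weighted summation of neighboring node features described above can be sketched as follows. The abstract does not give the exact scoring function, so this example assumes the common form in which a score is a LeakyReLU of a learned vector `a` applied to the concatenated features of two nodes, followed by a softmax over each node's neighbors. The names `attention_weights` and `aggregate` and the toy graph are hypothetical.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def attention_weights(features, neighbors, a):
    """Return {i: {j: alpha_ij}}, softmax-normalised over each node's neighbours.
    The score for (i, j) depends only on the two nodes' features (assumed form)."""
    weights = {}
    for i, nbrs in neighbors.items():
        scores = []
        for j in nbrs:
            concat = features[i] + features[j]  # list concatenation
            scores.append(leaky_relu(sum(ak * xk for ak, xk in zip(a, concat))))
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights[i] = {j: e / total for j, e in zip(nbrs, exps)}
    return weights

def aggregate(features, weights):
    """Weighted sum of neighbour features for every node."""
    out = {}
    for i, w in weights.items():
        dim = len(features[i])
        out[i] = [sum(w[j] * features[j][k] for j in w) for k in range(dim)]
    return out

# Toy graph: three mutually connected nodes with 2-dimensional features.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
alpha = attention_weights(features, neighbors, a=[0.5, -0.2, 0.3, 0.1])
hidden = aggregate(features, alpha)
```

In a trained model the vector `a` (and usually a feature transformation applied before scoring) would be learned; here it is fixed by hand purely to show the mechanics.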
Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data based on Information Loss Constraints
The storage, management, and application of massive spatio-temporal data are
important in various practical scenarios, including public safety.
However, due to the unique spatio-temporal distribution characteristics of
real-world data, most existing methods have limitations in terms of the
spatio-temporal proximity of data and load balancing in distributed storage.
Therefore, this paper proposes an efficient partitioning method of large-scale
public safety spatio-temporal data based on information loss constraints
(IFL-LSTP). The IFL-LSTP model specifically targets large-scale spatio-temporal
point data by combining the spatio-temporal partitioning module (STPM) with
the graph partitioning module (GPM). This approach can significantly reduce the
scale of data while maintaining the model's accuracy, in order to improve the
partitioning efficiency. It can also ensure the load balancing of distributed
storage while maintaining spatio-temporal proximity of the data partitioning
results. This method provides a new solution for distributed storage of
massive spatio-temporal data. The experimental results on multiple real-world
datasets demonstrate the effectiveness and superiority of IFL-LSTP.
Assessing the Impact of Retreat Mechanisms in a Simple Antarctic Ice Sheet Model Using Bayesian Calibration
The response of the Antarctic ice sheet (AIS) to changing climate forcings is
an important driver of sea-level changes. Anthropogenic climate change may
drive a sizeable AIS tipping point response with subsequent increases in
coastal flooding risks. Many studies analyzing flood risks use simple models to
project the future responses of AIS and its sea-level contributions. These
analyses have provided important new insights, but they are often silent on the
effects of potentially important processes such as Marine Ice Sheet Instability
(MISI) or Marine Ice Cliff Instability (MICI). These approximations can be well
justified and result in more parsimonious and transparent model structures.
This raises the question of how this approximation impacts hindcasts and
projections. Here, we calibrate a previously published and relatively simple
AIS model, which neglects the effects of MICI and regional characteristics,
using a combination of observational constraints and a Bayesian inversion
method. Specifically, we approximate the effects of missing MICI by comparing
our results to those from expert assessments with more realistic models and
quantify the bias during the last interglacial when MICI may have been
triggered. Our results suggest that the model can approximate the process of
MISI and reproduce the projected median melt from some previous expert
assessments in the year 2100. Yet, our mean hindcast is roughly 3/4 of the
observed data during the last interglacial period and our mean projection is
roughly 1/6 and 1/10 of the mean from a model accounting for MICI in the year
2100. These results suggest that missing MICI and/or regional characteristics
can lead to a low bias in AIS melting during warming periods and hence a potential
low bias in projected sea levels and flood risks.
Comment: v1: 16 pages, 4 figures, 7 supplementary files; v2: 15 pages, 4
figures, 7 supplementary files, corrected typos, revised title, updated
according to revisions made through publication process
Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods
Analyzing massive spatial datasets using a Gaussian process model poses computational challenges. This is a problem prevailing heavily in applications such as environmental modeling, ecology, forestry and environmental health. We present a novel approximate inference methodology that uses profile likelihood and Krylov subspace methods to estimate the spatial covariance parameters and makes spatial predictions with uncertainty quantification for point-referenced spatial data. The proposed method, Kryging, applies to both observations on a regular grid and irregularly spaced observations, and to any Gaussian process with a stationary isotropic (and certain geometrically anisotropic) covariance function, including the popular Matérn covariance family. We make use of the block Toeplitz structure with Toeplitz blocks of the covariance matrix and use fast Fourier transform methods to bypass the computational and memory bottlenecks of approximating log-determinant and matrix-vector products. We perform extensive simulation studies to show the effectiveness of our model by varying sample sizes, spatial parameter values and sampling designs. A real data application is also performed on a dataset consisting of land surface temperature readings taken by the MODIS satellite. Compared to existing methods, the proposed method performs satisfactorily with much less computation time and better scalability.
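To illustrate why Toeplitz structure helps with matrix-vector products, the sketch below multiplies a symmetric Toeplitz matrix by a vector through circulant embedding and the FFT, without ever forming the matrix. It covers only the one-dimensional Toeplitz case, not the block Toeplitz with Toeplitz blocks structure of the paper, and it is not the authors' code. The small recursive FFT keeps the example dependency-free; in practice an FFT library would be used. The function names are assumptions.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unnormalised); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def toeplitz_matvec(c, x):
    """Multiply the symmetric Toeplitz matrix with first column c by x."""
    n = len(c)
    m = 1
    while m < 2 * n - 1:  # embedding size: a power of two >= 2n - 1
        m *= 2
    # First column of the circulant matrix that contains the Toeplitz matrix.
    v = list(c) + [0.0] * (m - (2 * n - 1)) + list(c[:0:-1])
    x_pad = list(x) + [0.0] * (m - n)
    # A circulant matvec is a circular convolution, i.e. a product in Fourier space.
    prod = [a * b for a, b in zip(fft(v), fft(x_pad))]
    y = fft(prod, inverse=True)
    return [val.real / m for val in y[:n]]

# Hypothetical usage: covariances c[k] at lag k on a regular 1-D grid.
c = [1.0, 0.5, 0.25, 0.125, 0.0625]
x = [1.0, -2.0, 0.5, 3.0, 4.0]
y = toeplitz_matvec(c, x)
```

Only the first column `c` is stored, so memory is O(n) rather than O(n²), and each product costs O(n log n). Krylov subspace solvers need nothing from the covariance matrix beyond such products.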
Kryging: Geostatistical analysis of large-scale datasets using Krylov subspace methods
Analyzing massive spatial datasets using a Gaussian process model poses
computational challenges. This is a problem prevailing heavily in applications
such as environmental modeling, ecology, forestry and environmental health. We
present a novel approximate inference methodology that uses profile likelihood
and Krylov subspace methods to estimate the spatial covariance parameters and
makes spatial predictions with uncertainty quantification. The proposed method,
Kryging, applies to both observations on a regular grid and irregularly spaced
observations, and for any Gaussian process with a stationary covariance
function, including the popular Matérn covariance family. We make use of the
block Toeplitz structure with Toeplitz blocks of the covariance matrix and use
fast Fourier transform methods to alleviate the computational and memory
bottlenecks. We perform extensive simulation studies to show the effectiveness
of our model by varying sample sizes, spatial parameter values and sampling
designs. A real data application is also performed on a dataset consisting of
land surface temperature readings taken by the MODIS satellite. Compared to
existing methods, the proposed method performs satisfactorily with much less
computation time and better scalability.